22 research outputs found

BUT CHiME-7 system description

Author: Barchi Germán
Beneš Karel
Karafiát Martin
Mošner Ladislav
Pepino Leonardo
Szöke Igor
Veselý Karel
Witkowski Marcin
Publication date: 18/10/2023

This paper describes the joint effort of Brno University of Technology (BUT), AGH University of Krakow and University of Buenos Aires on the development of Automatic Speech Recognition systems for the CHiME-7 Challenge. We train and evaluate various end-to-end models with several toolkits. We relied heavily on Guided Source Separation (GSS) to convert multi-channel audio to single channel. The ASR leverages speech representations from models pre-trained by self-supervised learning, and we fuse several ASR systems. In addition, we modified external data from the LibriSpeech corpus to bring it closer to the target domain and added it to the training data. Our efforts focused on the far-field acoustic robustness sub-track of Task 1, Distant Automatic Speech Recognition (DASR); our systems use oracle segmentation. Comment: 6 pages, Chime-7 challenge 202

arXiv.org e-Print Archive

ATCO2 corpus: A Large-Scale Dataset for Research on Automatic Speech Recognition and Natural Language Understanding of Air Traffic Control Communications

Author: Cevenini Claudia
Choukri Khalid
Kocour Martin
Kolčárek Pavel
Motlicek Petr
Nigmatulina Iuliia
Prasad Amrutha
Rigault Mickael
Sarfjoo Seyyed Saeed
Szöke Igor
Tart Allan
Veselý Karel
Zuluaga-Gomez Juan
Černocký Jan
Publication date: 08/11/2022

Personal assistants, automatic speech recognizers and dialogue understanding systems are becoming more critical in our interconnected digital world. A clear example is air traffic control (ATC) communications. ATC aims at guiding aircraft and controlling the airspace in a safe and optimal manner. These voice-based dialogues are carried out between an air traffic controller (ATCO) and pilots via very-high-frequency radio channels. In order to incorporate these novel technologies into ATC (a low-resource domain), large-scale annotated datasets are required to develop the data-driven AI systems. Two examples are automatic speech recognition (ASR) and natural language understanding (NLU). In this paper, we introduce the ATCO2 corpus, a dataset that aims at fostering research on the challenging ATC field, which has lagged behind due to lack of annotated data. The ATCO2 corpus covers 1) data collection and pre-processing, 2) pseudo-annotations of speech data, and 3) extraction of ATC-related named entities. The ATCO2 corpus is split into three subsets. 1) The ATCO2-test-set corpus contains 4 hours of ATC speech with manual transcripts and a subset with gold annotations for named-entity recognition (callsign, command, value). 2) The ATCO2-PL-set corpus consists of 5281 hours of unlabeled ATC data enriched with automatic transcripts from an in-domain speech recognizer, contextual information, speaker turn information, signal-to-noise ratio estimate and English language detection score per sample. Both are available for purchase through ELDA at http://catalog.elra.info/en-us/repository/browse/ELRA-S0484. 3) The ATCO2-test-set-1h corpus is a one-hour subset from the original test set corpus, which we are offering for free at https://www.atco2.org/data.
We expect the ATCO2 corpus will foster research on robust ASR and NLU not only in the field of ATC communications but also in the general research community. Comment: Manuscript under review; The code will be available at https://github.com/idiap/atco2-corpu

arXiv.org e-Print Archive

Fusarium: more than a node or a foot-shaped basal cell

Author: Abdel-Azeem M. A.
Abdollahzadeh J.
Abdolrasouli Alireza
Akulov A.
Alberts Johanna F.
Araújo João P.M.
Ariyawansa Hiran A.S.
Bakhshi Mounes
Ben Hadj Amor A.
Bensch Konstanze
Bezerra J. D. P.
Boekhout T.
Cai L.
Carbia Mauricio
Cardinali Gianluigi
Castañeda-Ruiz R. F.
Celis-Ramírez Adriana Marcela
Chaturvedi Vishnu
Chaverri Priscila
Collémare Jérôme
Croll Daniel
Crous P. W.
Câmara M. P. S.
Damm Ulrike
Decock Cony A.
Ezekiel Chibundu N.
Fan X. L.
Fernández Norma Beatriz
Frisvad Jens Christian
Gaya Ester
Gené J.
González Cristian D.
Gramaje David
Groenewald J.Z.
Grube Martin
Guarnaccia V.
Guarro J.
Guevara-Suarez Marcela I.
Gupta Vijai Kumar
Haddaji A.
Haelewaters Danny
Hagen Ferry
Hansen Karen
Hashimoto A.
Hernández-Restrepo Margarita
Hirooka Y.
Houbraken Jos A. M. P.
Hubka Vít
Hyde K.D.
Iturriaga Teresa M.
Jeewon Rajesh
Johnston Peter R.
Jurjević Željko
Karaltı İskender
Kema Gerrit H. J.
Korsten L.
Kuramae Eiko Eurya
Kušan Ivana
Labuda Román
Lamprecht Sandra Christina
Lawrence Daniel P.
Lechat Christian
Lee Hyangburm
Li Hongye
Litovka Yulia A.
Lombard L.
Maharachchikumbura Sajeewa S. N.
Marín-Felix Yasmina
Matio Kemkuignou B.
Matočec Neven
McTaggart Alistair R.
Merwe N. A. van der
Messias Rodrigues Anderson
Mlčoch Patrik
Mugnai Laura
Nakashima Chiharu
Nilsson R. H.
Noumeur Sara Raouia
Pavlov Igor N.
Peralta M. P.
Phillips Alan J. L.
Pitt John I.
Ploch Sebastian
Polizzi Giancarlo
Quaedvlieg William
Rajeshkumar Kunhiraman C.
Restrepo Silvia
Rhaïem Azza
Robert J.
Robert Vincent A. R. G.
Rossman Amy Y.
Salgado-Salazar Catalina
Samson Robert A.
Sandoval-Denis M.
Santos Ariana C. S.
Schroers H. J.
Seifert K. A.
Shivas Roger G.
Souza-Motta C. M.
Stadler Marc
Summerbell Richard C.
Sun G. Y.
Swart Wijnand J.
Szöke Szániszló
Tan Yupei
Taylor Joanne E.
Taylor John W.
Taylor P. W. J.
Thines Marco
Verkley Gerard J. M.
Vieira Tiago Patricia
Vieira Willie A. S.
Visagie C. M.
Vizzini Alfredo
Vries Ronald P. de
Váczy Kálmán Zoltán
Weir Bevan S.
Wiele Nathalie van de
Wijayawardene Nalin N.
Xia J. W.
Yilmaz N.
Yurkov Andrey M.
Yáñez-Morales María de Jesús
Zamora Juan Carlos
Zare Rasoul
Zhang C. L.
Publication venue: 'Elsevier BV'
Publication date: 24/02/2022

Recent publications have argued that there are potentially serious consequences for researchers in recognising distinct genera in the terminal fusarioid clade of the family Nectriaceae. Thus, an alternate hypothesis, namely a very broad concept of the genus Fusarium was proposed. In doing so, however, a significant body of data that supports distinct genera in Nectriaceae based on morphology, biology, and phylogeny is disregarded. A DNA phylogeny based on 19 orthologous protein-coding genes was presented to support a very broad concept of Fusarium at the F1 node in Nectriaceae. Here, we demonstrate that re-analyses of this dataset show that all 19 genes support the F3 node that represents Fusarium sensu stricto as defined by F. sambucinum (sexual morph synonym Gibberella pulicaris). The backbone of the phylogeny is resolved by the concatenated alignment, but only six of the 19 genes fully support the F1 node, representing the broad circumscription of Fusarium. Furthermore, a re-analysis of the concatenated dataset revealed alternate topologies in different phylogenetic algorithms, highlighting the deep divergence and unresolved placement of various Nectriaceae lineages proposed as members of Fusarium. Species of Fusarium s. str. are characterised by Gibberella sexual morphs, asexual morphs with thin- or thick-walled macroconidia that have variously shaped apical and basal cells, and trichothecene mycotoxin production, which separates them from other fusarioid genera. Here we show that the Wollenweber concept of Fusarium presently accounts for 20 segregate genera with clear-cut synapomorphic traits, and that fusarioid macroconidia represent a character that has been gained or lost multiple times throughout Nectriaceae. Thus, the very broad circumscription of Fusarium is blurry and without apparent synapomorphies, and does not include all genera with fusarium-like macroconidia, which are spread throughout Nectriaceae (e.g., Cosmosporella, Macroconia, Microcera). 
In this study four new genera are introduced, along with 18 new species and 16 new combinations. These names convey information about relationships, morphology, and ecological preference that would otherwise be lost in a broader definition of Fusarium. To assist users to correctly identify fusarioid genera and species, we introduce a new online identification database, Fusarioid-ID, accessible at www.fusarium.org. The database comprises partial sequences from multiple genes commonly used to identify fusarioid taxa (act1, CaM, his3, rpb1, rpb2, tef1, tub2, ITS, and LSU). In this paper, we also present a nomenclator of names that have been introduced in Fusarium up to January 2021 as well as their current status, types, and diagnostic DNA barcode data. In this study, researchers from 46 countries, representing taxonomists, plant pathologists, medical mycologists, quarantine officials, regulatory agencies, and students, strongly support the application and use of a more precisely delimited Fusarium (= Gibberella) concept to accommodate taxa from the robust monophyletic node F3 on the basis of a well-defined and unique combination of morphological and biochemical features. This F3 node includes, among others, species of the F. fujikuroi, F. incarnatum-equiseti, F. oxysporum, and F. sambucinum species complexes, but not species of Bisifusarium [F. dimerum species complex (SC)], Cyanonectria (F. buxicola SC), Geejayessia (F. staphyleae SC), Neocosmospora (F. solani SC) or Rectifusarium (F. ventricosum SC). The present study represents the first step to generating a new online monograph of Fusarium and allied fusarioid genera (www.fusarium.org)

Digital.CSIC

Interaction Testing and Polygenic Risk Scoring to Estimate the Association of Common Genetic Variants with Treatment Resistance in Schizophrenia

Author: Adolfsson Rolf
Agartz Ingrid
Agbedjro Deborah
Agerbo Esben
Ajnakina Olesya
Alameda Luis
Albus Margot
Alexander Madeline
Amin Farooq
Andreassen Ole A.
Bacanu Silviu A.
Barnes Thomas R. E.
Baune Bernhard T.
Begemann Martin
Belliveau Richard A.
Bene Judit
Berardi Domenico
Bergen Sarah E.
Bevilacqua Elizabeth
Bigdeli Tim B.
Black Donald W.
Blackwood Douglas H. R.
Bonora Elena
Bramon Elvira
Bruggeman Richard
Buccola Nancy G.
Buckner Randy L.
Bulik-Sullivan Brendan
Buxbaum Joseph D.
Byerley William
Børglum Anders D.
Cahn Wiepke
Cai Guiqing
Cairns Murray J.
Campion Dominique
Camporesi Sara
Cantor Rita M.
Carr Vaughan J.
Carrera Noa
Catts Stanley V.
Chambert Kimberly D.
Chan Raymond C. K.
Chen Eric Y. H.
Chen Ronald Y. L.
Cheng Wei
Cheung Eric F. C.
Chong Siow Ann
Cichon Sven
Clair David St
Cleusix Martine
Cloninger C. Robert
Cohen David
Cohen Nadine
Collier David A.
Conus Philippe
Cormican Paul
Corvin Aiden
Craddock Nick
Crespo-Facorro Benedicto
Crowley James J.
Curtis David
Daly Mark J.
Darvasi Ariel
Davidson Michael
Davis Kenneth L.
Degenhardt Franziska
DeLisi Lynn E.
Demjaha Arsime
Demontis Ditte
Dennison Charlotte A.
Di Forti Marta
Dikeos Dimitris
Dinan Timothy
Djurovic Srdjan
Do Kim Q.
Domenici Enrico
Donohoe Gary
Doody Gillian A.
Drapeau Elodie
Duan Jubao
Dudbridge Frank
Durmishi Naser
D’Andrea Giuseppe
Eap Chin B.
Ehrenreich Hannelore
Eichhammer Peter
Eriksson Johan
Escott-Price Valentina
Esko Tõnu
Essioux Laurent
Fanous Ayman H.
Farh Kai-How
Farrell Martilias S.
Favero Jurgen Del
Ferchiou Aziz
Frank Josef
Franke Lude
Freedman Robert
Freimer Nelson B.
Friedl Marion
Friedman Joseph I.
Fromer Menachem
Gejman Pablo V.
Genovese Giulio
Georgieva Lyudmila
Gershon Elliot S.
Giegling Ina
Gill Michael
Giusti-Rodríguez Paola
Godard Stephanie
Goldstein Jacqueline I.
Golimbet Vera
Gopal Srihari
Gratten Jacob
Guidi Lorenzo
Gurling Hugh
Haan Lieuwe de
Hammer Christian
Hamshere Marian L.
Hansen Mark
Hansen Thomas
Haroutunian Vahram
Hartmann Annette M.
Henskens Frans A.
Herms Stefan
Hirschhorn Joel N.
Hoffmann Per
Hofman Andrea
Hollegaard Mads V.
Holmans Peter A.
Homman Lina
Hougaard David M.
Huang Hailiang
Hultman Christina M.
Ikeda Masashi
Iwata Nakao
Jablensky Assen V.
Jenni Raoul
Joa Inge
Joyce Eileen M.
Julià Antonio
Jönsson Erik G.
Kahn René S.
Kalaydjieva Luba
Kapur Shitij
Karachanak-Yankova Sena
Karjalainen Juha
Kassoumeri Laura
Kavanagh David
Keller Matthew C.
Kelly Brian
Kendler Kenneth S.
Kennedy James L.
Keong Jimmy Lee Chee
Kepinska Adrianna
Khadimallah Inès
Khrunin Andrey
Kim Yunjung
Kirov George
Klovins Janis
Knight Jo
Knowles James A.
Konte Bettina
Kowalec Kaarina
Kravariti Eugenia
Kucinskas Vaidutis
Kucinskiene Zita Ausrele
Kuzelova-Ptackova Hana
Kähler Anna K.
Lastrina Ornella
Laurent Claudine
Lee Phil
Lee S. Hong
Legge Sophie E.
Lencz Todd
Lerer Bernard
Levinson Douglas F.
Li Miaoxin
Li Qingqin S.
Li Tao
Liang Kung-Yee
Lieberman Jeffrey
Limborska Svetlana
Liu Jianjun
Loughland Carmel M.
Lubinski Jan
Lynham Amy J.
Lönnqvist Jouko
MacCabe James H.
Macek Milan
Magnusson Patrik K. E.
Maher Brion S.
Maier Wolfgang
Malhotra Anil K.
Mallet Jacques
Marsal Sara
Mattheisen Manuel
Mattingsdal Morten
McCarley Robert W.
McCarroll Steven A.
McDonald Colm
McIntosh Andrew M.
McQuillin Andrew
Meier Sandra
Meijer Carin J.
Melegh Bela
Melle Ingrid
Mesholam-Gately Raquelle I.
Metspalu Andres
Michie Patricia T.
Milani Lili
Milanova Vihra
Millgate Edward
Mokrab Younes
Moran Jennifer L.
Morris Derek W.
Mors Ole
Mortensen Preben B.
Mowry Bryan J.
Muratori Roberto
Murphy Kieran C.
Murray Robin M.
Myin-Germeys Inez
Müller-Myhsok Bertram
Neale Benjamin M.
Nelis Mari
Nenadic Igor
Nertney Deborah A.
Nestadt Gerald
Nicodemus Kristin K.
Nikitina-Zake Liene
Nisenbaum Laura
Nordin Annelie
Noyan Handan
Nöthen Markus M.
Oh Sang-Yun
Olincy Ann
Olsen Line
Ophoff Roel A.
Os Jim Van
Owen Michael J.
O’Callaghan Eadbhard
O’Donovan Michael C.
O’Dushlaine Colm
O’Neill F. Anthony
O’Neill Francis A.
Palotie Aarno
Pantelis Christos
Papadimitriou George N.
Papiol Sergi
Pardiñas Antonio F.
Parkhomenko Elena
Pato Carlos N.
Pato Michele T.
Paunio Tiina
Pejovic-Milovancevic Milica
Periyasamy Sathish
Perkins Diana O.
Pers Tune H.
Petryshen Tracey L.
Pietiläinen Olli
Pignon Baptiste
Pimm Jonathan
Pocklington Andrew J.
Posthuma Danielle
Powell John
Price Alkes
Pulver Ann E.
Purcell Shaun M.
Quested Digby
Rasmussen Henrik B.
Reichenberg Abraham
Reimers Mark A.
Restellini Romeo
Richard Jean-Romain
Richards Alexander L.
Rietschel Marcella
Riley Brien P.
Ripke Stephan
Roffman Joshua L.
Roussos Panos
Ruderfer Douglas M.
Rujescu Dan
Salomaa Veikko
Sanders Alan R.
Schall Ulrich
Schubert Christian R.
Schulze Thomas G.
Schwab Sibylle G.
Schürhoff Franck
Scolnick Edward M.
Scott Rodney J.
Seidman Larry J.
Sham Pak C.
Shi Jianxin
Sigurdsson Engilbert
Silagadze Teimuraz
Silverman Jeremy M.
Sim Kang
Simonsen Carmen
Sklar Pamela
Slominsky Petr
Smart Sophie E.
Smoller Jordan W.
So Hon-Cheong
Spencer Chris C. A.
St Clair David
Stahl Daniel
Stahl Eli A.
Stefansson Hreinn
Stefansson Kari
Steinberg Stacy
Stogmann Elisabeth
Straub Richard E.
Strengman Eric
Strohmaier Jana
Stroup T. Scott
Subramaniam Mythily
Sullivan Patrick F.
Suvisaari Jaana
Svrakic Dragan M.
Szatkiewicz Jin P.
Szöke Andrei
Söderman Erik
Tarricone Ilaria
Thirumalai Srinivas
Toncheva Draga
Tooney Paul
Tortelli Andrea
Tosato Sarah
Veijola Juha
Visscher Peter M.
Vázquez-Bourgon Javier
Waddington John
Walsh Dermot
Walters James T. R.
Wang Dai
Wang Qiang
Webb Bradley T.
Weinberger Daniel R.
Weiser Mark
Wendland Jens R.
Werge Thomas
Wildenauer Dieter B.
Willcocks Isabella R.
Williams Nigel M.
Williams Stephanie
Witt Stephanie H.
Wolen Aaron R.
Wong Emily H. M.
Wormley Brandon K.
Wray Naomi R.
Wu Jing Qin
Xi Hualin Simon
Zai Clement C.
Zheng Xuebin
Zimprich Fritz
Üçok Alp
Španiel Filip
Publication venue: 'American Medical Association (AMA)'
Publication date: 01/01/2022

Importance: About 20% to 30% of people with schizophrenia have psychotic symptoms that do not respond adequately to first-line antipsychotic treatment. This clinical presentation, chronic and highly disabling, is known as treatment-resistant schizophrenia (TRS). The causes of treatment resistance and their relationships with causes underlying schizophrenia are largely unknown. Adequately powered genetic studies of TRS are scarce because of the difficulty in collecting data from well-characterized TRS cohorts. Objective: To examine the genetic architecture of TRS through the reassessment of genetic data from schizophrenia studies and its validation in carefully ascertained clinical samples. Design, Setting, and Participants: Two case-control genome-wide association studies (GWASs) of schizophrenia were performed in which the case samples were defined as individuals with TRS (n = 10501) and individuals with non-TRS (n = 20325). The differences in effect sizes for allelic associations were then determined between both studies, the reasoning being such differences reflect treatment resistance instead of schizophrenia. Genotype data were retrieved from the CLOZUK and Psychiatric Genomics Consortium (PGC) schizophrenia studies. The output was validated using polygenic risk score (PRS) profiling of 2 independent schizophrenia cohorts with TRS and non-TRS: a prevalence sample with 817 individuals (Cardiff Cognition in Schizophrenia [CardiffCOGS]) and an incidence sample with 563 individuals (Genetics Workstream of the Schizophrenia Treatment Resistance and Therapeutic Advances [STRATA-G]). Main Outcomes and Measures: GWAS of treatment resistance in schizophrenia. The results of the GWAS were compared with complex polygenic traits through a genetic correlation approach and were used for PRS analysis on the independent validation cohorts using the same TRS definition. 
Results: The study included a total of 85490 participants (48635 [56.9%] male) in its GWAS stage and 1380 participants (859 [62.2%] male) in its PRS validation stage. Treatment resistance in schizophrenia emerged as a polygenic trait with detectable heritability (1% to 4%), and several traits related to intelligence and cognition were found to be genetically correlated with it (genetic correlation, 0.41-0.69). PRS analysis in the CardiffCOGS prevalence sample showed a positive association between TRS and a history of taking clozapine (r² = 2.03%; P = .001), which was replicated in the STRATA-G incidence sample (r² = 1.09%; P = .04). Conclusions and Relevance: In this GWAS, common genetic variants were differentially associated with TRS, and these associations may have been obscured through the amalgamation of large GWAS samples in previous studies of broadly defined schizophrenia. Findings of this study suggest the validity of meta-analytic approaches for studies on patient outcomes, including treatment resistance.
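
Illustrative note: the abstract above does not say how the polygenic risk scores were computed (variant selection, thresholds, or software). As a minimal sketch of the general idea only, a PRS is commonly formed as a weighted sum of a person's allele dosages, with GWAS effect sizes as weights. The variant names, effect sizes, and genotypes below are invented for illustration and are not taken from the study.

```python
def polygenic_risk_score(effect_sizes, dosages):
    """Weighted sum of allele dosages: sum over variants of beta * dosage."""
    score = 0.0
    for variant, beta in effect_sizes.items():
        # Simplification assumed by this sketch: a missing genotype contributes 0.
        score += beta * dosages.get(variant, 0.0)
    return score


# Hypothetical per-variant effect sizes (for example, log odds ratios from a GWAS).
effect_sizes = {"variant_1": 0.10, "variant_2": -0.05, "variant_3": 0.02}

# Hypothetical genotype: number of effect alleles carried at each variant (0, 1 or 2).
person = {"variant_1": 2, "variant_2": 1, "variant_3": 0}

score = polygenic_risk_score(effect_sizes, person)
print(f"PRS = {score:.2f}")
```

In a validation setting such as the one described, scores of this kind would be computed per individual and then tested for association with the outcome; that regression step is not shown here.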

Repository@Nottingham

Juelich Shared Electronic Resources

Automatic language identification using phoneme and automatically derived unit strings

Author: Igor Szöke
Pavel Matějka
Petr Schwarz
Publication venue: Springer, ISBN
Publication date: 01/01/2004

Language identification (LID) based on phonotactic modeling is presented in this paper. Approaches using phoneme strings and strings of units automatically derived by an Ergodic HMM (EHMM) are compared. The phoneme recognizers were trained on 6 languages from the OGI multi-language corpus and Czech SpeechDat-E. The LID results are obtained on 4 languages. The results show the superiority of the Czech phoneme recognizer when used in LID and promising trends using the EHMM-derived units.
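
Illustrative note: phonotactic LID is usually understood as modeling which unit sequences are likely in each language and choosing the language whose model best explains a recognized string. The abstract does not state the model order or smoothing used, so the sketch below assumes add-one-smoothed bigram models; the language labels and phoneme strings are invented toy data, not the OGI or SpeechDat-E material.

```python
import math
from collections import Counter


def train_bigram_model(phoneme_strings):
    """Count phoneme bigrams and their left contexts over the training strings."""
    bigrams, contexts, vocab = Counter(), Counter(), set()
    for phones in phoneme_strings:
        padded = ["<s>"] + phones + ["</s>"]
        vocab.update(padded)
        for prev, cur in zip(padded, padded[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts, vocab


def log_likelihood(model, phones, vocab_size):
    """Log-probability of a phoneme string under an add-one-smoothed bigram model."""
    bigrams, contexts, _ = model
    padded = ["<s>"] + phones + ["</s>"]
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size))
    return total


def identify_language(models, phones):
    """Return the language whose model assigns the highest likelihood."""
    # Use one shared vocabulary size so scores are comparable across languages.
    vocab_size = len(set().union(*(m[2] for m in models.values())))
    return max(models, key=lambda lang: log_likelihood(models[lang], phones, vocab_size))


# Toy training strings (hypothetical recognizer output for two made-up languages).
training = {
    "lang_a": [["k", "a", "t", "a"], ["t", "a", "k", "a"], ["k", "a", "k", "a"]],
    "lang_b": [["s", "t", "r", "i"], ["s", "t", "r", "o"], ["t", "r", "i", "s"]],
}
models = {lang: train_bigram_model(strings) for lang, strings in training.items()}

guess = identify_language(models, ["t", "a", "k", "a", "t", "a"])
print(guess)
```

The same scoring would apply unchanged to strings of automatically derived units, since the models only see symbol sequences.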

CiteSeerX

Phoneme based acoustics keyword spotting in informal continuous speech

Author: Igor Szöke
Martin Karafiát
Petr Schwarz
Publication date: 01/01/2005

mixture (GM) hidden Markov modelling (HMM). Context-independent and dependent phoneme models are used in our system. The system was trained and evaluated on informal continuous speech. We used different complexities of KWS recognition networks and different types of phoneme models. The impact of these parameters on the accuracy and computational complexity is investigated.

CiteSeerX